Embedded Systems

UltraTrail: A Configurable Ultra-Low Power TC-ResNet AI Accelerator for Efficient Keyword Spotting

by Paul Palomero Bernardo, Christoph Gerum, Adrian Frischknecht, Konstantin Lübeck, and Oliver Bringmann
In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), pages 1-12, 2020.

Abstract

Recent advances in machine learning show the superior behavior of temporal convolutional networks (TCNs) and especially their combination with residual networks (TC-ResNet) for intelligent sensor signal processing in comparison to classical CNNs and LSTMs. In this paper, we propose UltraTrail, a configurable, ultra-low power TC-ResNet AI accelerator for sensor signal processing and its application to efficient keyword spotting. Following a strict hardware/model co-design approach, we have derived an optimized low-power hardware architecture for generalized TC-ResNet topologies consisting of a configurable array of processing elements and a distributed memory with dynamic content re-allocation. We additionally extend the network with conditional computing to reduce the number of operations during inference and to provide the possibility for power-gating. The final accelerator implementation in Globalfoundries' 22FDX technology achieves a power consumption of 8.2 µW for the task of always-on keyword spotting, meeting the real-time requirement of 100 ms per inference with an accuracy of 93 % on the Google Speech Commands Dataset.
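
To make the TC-ResNet structure referenced in the abstract concrete, below is a minimal, hypothetical sketch of one residual block built from 1-D temporal convolutions in PyTorch. The channel counts, kernel size, and stride are illustrative assumptions only; they do not reflect UltraTrail's actual layer configuration, quantization, or the conditional-computing extension described in the paper.

```python
# Illustrative TC-ResNet-style residual block using 1-D temporal convolutions.
# All hyperparameters here are placeholders, not the UltraTrail configuration.
import torch
import torch.nn as nn


class TCResidualBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 9, stride: int = 1):
        super().__init__()
        pad = kernel_size // 2
        # Two convolutions along the time axis (input layout: batch x channels x time).
        self.conv1 = nn.Conv1d(in_ch, out_ch, kernel_size, stride=stride, padding=pad, bias=False)
        self.bn1 = nn.BatchNorm1d(out_ch)
        self.conv2 = nn.Conv1d(out_ch, out_ch, kernel_size, padding=pad, bias=False)
        self.bn2 = nn.BatchNorm1d(out_ch)
        self.relu = nn.ReLU()
        # 1x1 shortcut to match channels/stride when the main path changes the shape.
        if in_ch != out_ch or stride != 1:
            self.shortcut = nn.Sequential(
                nn.Conv1d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm1d(out_ch),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Residual addition followed by the final activation.
        return self.relu(out + self.shortcut(x))


# Example: a batch with 40 feature bins over 101 time steps (hypothetical input size).
x = torch.randn(1, 40, 101)
y = TCResidualBlock(40, 48, stride=2)(x)
print(y.shape)  # torch.Size([1, 48, 51])
```

In a full TC-ResNet, several such blocks are stacked and followed by pooling and a classifier; the hardware/model co-design in the paper maps this regular block structure onto a configurable array of processing elements and a distributed memory.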